Enhanced sampling in generalized ensemble with large gap of sampling parameter: 

case study in temperature space random walk 
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We present an efficient sampling method for computing a partition function and accelerating 
configuration sampling. The method performs a random walk in the A space, with A being any 
thermodynamic variable that characterizes a canonical ensemble such as the reciprocal temperature 
P or any variable that the Hamiltonian explicitly depends on. The partition function is determined 
by minimizing the difference of the thermal conjugates of A (the energy in the case of A = 0), 
defined as the difference between the value from the dynamically updated derivatives of the partition 
function and the value directly measured from simulation. Higher-order derivatives of the partition 
function are included to enhance the Brownian motion in the A space. The method is much less 
sensitive to the system size, and the size of A window than other methods. On the two dimensional 
Ising model, it is shown that the method asymptotically converges the partition function, and the 
error of the logarithm of the partition function is much smaller than the algorithm using the Wang- 
Landau recursive scheme. The method is also applied to off-lattice model proteins, the AB models, 
in which cases many low energy states are found in different models. 



I. INTRODUCTION 

In a canonical ensemble, the partition function is de- 
fined as a sum over configurations (denoted by X), 



Z(A) = ^exp[-W A (X)] ; 



(1) 



x 



where Tt\{X) is the reduced Hamiltonian of the system 
with a dependence on a variable A. In the case that the 
A dependence only exists in the energy function E\[X), 
TL\{X) can be written as (3E\{X), where [3 = 1/T is the 
reciprocal temperature. However, A can be the tempera- 
ture /?, leaving the energy function E(X) independent of 
[3. 

A quantity of particular interest is the ratio of the 
partition function Z(Xi)/Z(Xq) at two given A's, Ai and 
Ao (if A does not explicitly involve the temperature, the 
ratio can be translated to the free energy difference as 
AF = -Tln[Z(Ai)/Z(A )]). In our previous study [l[, 
the partition function is computed in an expanded ensem- 
ble, where a regular simulation is coupled with random 
transitions among different A's, e.g., temperatures /3's or 
volumes Vs. This approach requires two neighboring 
distributions of the corresponding macroscopic quanti- 
ties, such as the energy for the temperature or the virial 
for the volume, to overlap sufficiently. Accordingly the 
spacing A A is proportional to and the number of 

A sampling points increases as yN, as the system size 
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TV grows. Many other methods, such as replica exchange 
, simulated tempering H , and others 0,11] , has similar 
issures. 

To overcome this problem of increasingly large number 
of sampling points, we define A as a continuous variable 
instead of a discrete one. The partition function Z{X) 
is characterized as a continuous function by a few ad- 
justable parameters (e.g., derivatives with respect to A) 
for a large A window. In this way, the number of A win- 
dows can be significantly reduced, and one can handle a 
much larger system. Moreover, the ratio of the partition 
function at endpoints (i.e., boundaries of a A window), 
Z(\i)/Z(\q), can be asymptotically determined through 
a simulation. 

In this paper, we first present the theoretical back- 
ground of the method in a general framework. Next 
the simulation protocol is exemplified in a special case 
where A is the reciprocal temperature (3. We then nu- 
merically test the performance of the method on the two- 
dimensional Ising model, and apply the method to fold- 
ing model proteins. At the end, we conclude the method 
with discussions. 



II. METHOD 

A. General Theory 

We start by constructing a generalized ensemble com- 
posed of canonical ensembles of a continuous A range. 
The probability of A being in the interval (A, A + dX) is 



2 



given by 

w(X)d\ = w(\)^^ d\ 



Z{\) 

^w(A) exp\-H x (X) -\nZ(\) 



x 



dX, 



(2) 



where we have introduced an approximate partition func- 
tion Z(X), as well as a predefined weight function w(X). 
In the second line, the partition function is expanded us- 
ing Eq. (fT]). If, in a special case, w(X) is a constant, the 
generalized ensemble corresponds to a flat A histogram 

An efficient sampling has three requirements. First, 
the method should yield the correct partition function ra- 
tio at the end points Z(Xi)/Z(Xq), or the free energy dif- 
ference. Second, the difference between the actual weight 
w(X) should be close to the desired one w(X), or equiv- 
alently Z(X) should be close to Z(X). Last, an efficient 
scheme for the A-space random walk is required. 

The first requirement can be rephrased to a condition 
on the average properties of the ensemble, 







g(Ai) Z(X ) 
£(Ai) Z(X ) 



d\nZ 
OX 



P(A) 



\ dX 



(3) 



dX 



Here, (■ ■ - ) A denotes a configuration average at a fixed 

A; P(A) = —dlnZ/dX is the derivative of the estimated 
partition function; P\(X) = dH.\(X)/dX is the thermal 
conjugate of A; (A)^ denotes a weighted average in the 
generalized ensemble for quantity A, 

<4^r^)^=/%A).5^A. 



w(X) 



Z(X) 



It is evident that as long as the two averages on the right 
hand side of the last line of Eq. ^ are equal, the re- 
quirement on the correct ratio of the partition function 
is satisfied. In simulation, (P(A)) A from the estimated 
partition function is dynamically adjusted to be equal to 

the measured average ( (P\(X)) ) (practically, the 

two-fold average is translated to the average in the gen- 
eralized ensemble, which is then replaced by a trajectory 
average measured along simulation). As a result, one 
can eventually obtain the correct ratio of the partition 
function. 

The second requirement is satisfied by minimizing the 
following quantity, 



S = 



P(X)-(P X (X)) > 



(4) 



Since P(A) and (Pa(A)) a are the derivatives of — In Z{X) 
and — \nZ{X) respectively, S is minimized when the two 
are equal at any A. However, a perfect match between 
P(A) and ( x P\(X)) x is practically impossible to reach. 

We thus adopt a variational approach, where P(A) is ap- 
proximated as a linear combination of a few trial func- 
tions 4>k(X) y s as 

P(A) =J2^kW- 
k 

An example of the expansion is a power series P(A) = 
ao + aiA + a^A 2 + . . ., where, <fik(X) — X k , and aj, 
corresponds to the fcth order derivative of P(A), ak = 
(1/&!) d^P/dX^K Now one can minimize S with re- 
spect to the coefficients a^'s as dS/dak = 0. This deter- 
mines etfc's from the following set of equations, 

£ a k Uj(X) fe (A)\ = ((f>j(X) (Px(X)) x )_. (5) 



Similar to the case of Eq. ([3]), the two- fold average 

\4>j(X) (P\(X)Y j is equivalent to the average of 

4>j {X)P\ (X) in the generalized ensemble, and can be eval- 
uated from a simulation trajectory. The parameters a^'s 
are regularly updated in simulation accordingly to Eq. ([5]) 
to enforce the minimization of S. Once all afc's are ob- 
tained, the ratio of the estimated partition function can 
be calculated as 



lnZ(Ai) -lnZ(Ao) 



Ai 



P(A) dX 



k 



k- 



(6) 

where = J^ 1 cj)k(X')dX'. We now show that Eq. 

(O is compatible with the first requirement Eq. ([3]): 
assuming (f)Q = 1, the first equation of Eq. ([5]), i.e. 

the j — case, becomes (PfA))^ = l^2 k ak(pk 

(P X (X)) \ , which is identical to Eq. ©. 

Last, sampling in the generalized ensemble can be im- 
plemented by a regular configurational sampling at a 
fixed A, as well as a random walk in the A space. Any 
constant temperature algorithm can be used to generate 
configurational moves at a fixed A. For the A-space sam- 
pling, a convenient choice is to follow a Langevin equa- 
tion: 



dX 
~dt 



1 

w 



P X (X)-P(X) 



+ 



■w 



(7) 



where £ is a Gaussian white noise that satisfies 

(£(*)£(*')) = 2S (t ~ *')) with 1 bein S the simulation 
time. The equation is derived by treating V\(X) = 

Tix(X) + lnZ(A) in Eq. © as the "potential", and 

its derivative - dV x (X)/dX = -\P\(X) - P(A)j as 

the "force" of the A-space random walk. To show that 
Eq. (J?]) yields the correct A distribution, we examine 
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the time evolution of the A distribution p(X), described 
by the corresponding Fokker-Planck equation, dp/dt = 
(d/dX)[(dV x /dX) (p/w)] + {d 2 /8X 2 )(p/w), whose sta- 
tionary solution (dp/dt = 0) indeed gives the desired 
distribution p ~ w exp\—V\(X)\ . 

B. A case study: temperature space sampling 

In the following discussion, A is assumed to be the 
reciprocal temperature /3. Thus, P\(X) — d[(3E(X)] /d(3 
m Eq. © is the energy E, and P(X) can be interpreted 
as the estimated average energy E((3) — —d\nZ/df3. 

The aim is to compute the ratio of the partition func- 
tion Z (f3\) / Z ((3q) . The ratio is to be calculated from 
the estimated average energy E((3). For this reason, we 
look for a best fit between estimated average energy E((3) 
and the actual one \E) p- If one expands the estimated 

average energy as E((3) = a + ai(3 + a 2 j3 2 + . . ., the 
task is to determine the best fitting coefficients dfc's. As 
in the general formalism, the trial functions are 0o = 
l,0i — (3,4>2 — /3 2 ,---, and the coefficients correspond 
to derivatives of the estimated partition function, e.g., 
a Q = E(0) = -dhxZ/d/3\„ =0 and oi = dE/d(3\„ =Q = 

-d 2 \nZ/df3 2 \ fj=Q ... . 

The simulation procedure is described as the follows. 
To be more specific, we assume that a third order expan- 
sion of E(j3) = a + ai/3 + a 2 /3 2 is used. The method has 
two components: a regular configurational sampling at 
a given temperature /3, and a random walk in the tem- 
perature space. For generating configurational moves at 
a temperature /3, the Metropolis algorithm or a constant 
temperature molecular dynamics method can be used. 
After a configurational step is finished, we accumulate av- 
erages for </3), </3 2 ), </3 3 ), </3 4 ), (E), (f3E) and (f3 2 E), 
where the first four values correspond to (<f>j(X) <j)i(\)} 

and the rest to (0 3 -(A) (P\{X)) ) in Eq. ©. Note, 

the symbol (■ • ■) here is a shorthand notation for a tra- 
jectory average; it corresponds to an ensemble average in 
the generalized ensemble, where the averaging over both 
configuration and (3 is implied. For simplicity, we have 
also assumed w((3) to be a constant and thus dropped 
the w subscript. 

The temperature space random walk is realized by as- 
suming (3 as a continuous variable within the temperature 
range of interest (/3q,/3i). Although one can divide the 
whole temperature range into several sub- windows, we 
assume only one window in this example for the sake of 
simplicity. The current temperature (3 is updated reg- 
ularly, i.e., every a few configurational sampling steps. 
Before a temperature update, the coefficients do, a\ and 
a 2 are determined by solving Eq. ([5]), or explicitly 

/ 1 (P) (/m (ao\ ( (E) \ 

((3) ((3 2 ) M U = (PE) . 

V (p 2 ) In W U V (pe) ) 



Then the temperature (3 is updated according to the 
Langevin equation Eq. ([7]), or explicitly 

dp/dt = (o + ax/3 + a 2 (3 2 ) -E + £, 

where E is the current energy; £ is a Gaussian white noise 
that satisfies = 25 (t — t'), which can be con- 

veniently generated using a random number generator. 
If the Langevin equation drives the current temperature 
out of the entire temperature range, the update is re- 
jected and the old temperature is preserved. At the end, 
the ratio of the estimated partition function can be cal- 
culated as, 

In Z (/3 ) - hi Z 

= a ({3i - I3 ) + aM - /3 2 )/2 + a 2 (f3f - /3 3 )/3. 

As the simulation progresses, the coefficients ao, a\ and 
a 2 gradually converge to fixed values, and the ratio of the 
estimated partition function asymptotically approaches 
the ratio of the correct one Z(/3 )/^(/3i) -> Z(/3 )/Z(j3i). 

Note, in this method, although cifc's correspond 
to different order derivatives of the partition func- 
tion, the determination of these parameters by Eq. 
([5]) does not involve averages of high-order mo- 
ments of P\(X) such as ([Px(X)] 2 ) or ([P X (X)] 3 ) , 
or averages of high-order derivatives of TL\(X) 
such as (d 2 H x {X)/dX 2 ) = -(dP x {X)/dx) and 

(d 3 Hx{X)/dX 3 ) = -(d 2 P x {X)/dX 2 ) . This is a de- 

sirable feature because these high-order quantities are 
usually difficult to compute or may not be well-defined. 
We naturally avoid these quantities by using moments of 
the variable A instead. In the above example, ao, a\ and 
a 2 are determined by averages, such as </3 3 ) and (/3 2 E), 
but not high-order moments of E, such as {(3E 3 ) and 
(f3E 2 y This feature makes the updating more robust 
and the method more applicable for a general A. 

III. NUMERICAL RESULTS 

A. Two-dimensional Ising model 

We first perform a test on the 32 x 32 Ising model 
using the first, second, and third order series expansion 
of E(j3). The Metropolis algorithm is used to generate 
configuration changes. The range of (3 is (0,0.25), the 
corresponding T range is (4, +oo). The time step for 
integrating the Langevin equation Eq. (J7J) is 5 x 10~ 5 . 
The results are shown in Fig. [1] First let us examine 
the /3-histogram, which corresponds to the /3-distribution 
w{(3) defined in Eq. ©. According to Eq. fl3]), the 
values of the /3-histogram at the endpoints should be 
equal, i.e., w((3q) = w(fli) [w(f3) is constant here and 
Z(/3 )/Z(/3o) = Z((3i)/Z(j3x) according to Eq. ©]. Fig. 
[IJa) agrees with the expectation in all the three cases. In 
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addition, we expect the approximate partition function 
to be sufficiently close to the exact one. The difference 
of the two can be examined from the difference between 
the actual weight w{(3) and the desired weight w{(3). 
Since w{(3) is a constant in this case, the /3-histogram 
that represents w((3) should be sufficiently flat. It can 
be seen from Fig. [lja), the first-order algorithm yields 
a /3-distribution peaked at both the boundaries (3 = 
and (3 = 0.25, while the third-order algorithm yields an 
almost flat histogram. For the energy histograms [Fig. 
[TJb)], since the corresponding energy distributions are 
very different at (3 = and (3 = 0.25, E((3) cannot be 
represented by a constant value, and thus the first order 
algorithm becomes ineffective (no overlap between two 
energy distributions at the given /3-gap). On the other 
hand, using a higher order version, E{(3) can more effec- 
tively approximate the average energy as a function of 
(3. As a result, we can achieve a flatter /3-histogram as 
well as a broader energy histogram. The example shows 
that higher order algorithms can handle a much larger 
temperature gap than the first order one. 

We also compare the current method with the method 
from a previous study W which uses the Wang-Landau 
(WL) updating scheme [6J] to converge the partition func- 
tion. In both cases, we update the temperature after a 
sweep of configuration sampling. For the current method, 
the third order expansion with a single temperature win- 
dow is used. For the previous method, twenty five sam- 
pling temperatures are evenly distributed in the temper- 
ature range with A/3 = 0.01. Such a fine temperature 
interval ensures that the previous method targets the 
flat-/3-histogram ensemble and it has a good transition 
rate between neighboring temperatures. The parameters 
of the WL updating scheme are the following. The ini- 
tial value for the modification factor In / = 1.0 and it is 
shrunk by a factor of 2 at the end of each stage. The 
criterion for terminating a stage depends on the flatness 
of the temperature histogram. Three different choices 
of the flatness thresholds 20%, 50% and 99% are used 
(in the last case, a stage is terminated when each sam- 
pling temperature is visited at least once). We also use a 
recipe 0] of improving the convergence of the WL updat- 
ing scheme in final stages, where the modification factor 
In / is specified as 1/ t n in final stages regardless of the 
histogram flatness. Here t n is defined as the number of 
Monte Carlo steps divided by the number of tempera- 
tures. 

The results of the comparison are shown in Fig. [2] 
Since the exact partition function for the Ising model is 
available @ , the logarithmic error of the ratio of the par- 
tition function, defined as e = | A\nZ — ALnZ|, is used 
to measure the accuracy. For the current method, the 
accuracy improves steadily as the simulation progresses. 
The error e as a function simulation sweeps (MC steps 
per site) t can be fitted by regression as e = 11.6 t~ ' 52 . 
It is clear that the original WL recursive scheme suffers 
from the problem of saturation at a long simulation time. 
Although the l/t n recipe eases the problem, it is still less 
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FIG. 1: (a) The /3-histogram using different orders of E((3) 
expansion. The right axis is for the first order (the solid line). 
The left axis is for the rest two (the dashed and dotted line), 
(b) The corresponding energy histograms using different or- 
ders expansions. For comparison, the constant temperature 
energy histograms (using the Metropolis algorithm) for (3 = 
and (3 = 0.25 are shown in the shaded area. 



efficient than the method introduced here. The error of 
the current method at the end of 10 5 sweeps is 0.0297, 
while for the WL recursion with l/t n recipe, the error 
is 0.121. The same accuracy e = 0.121 can be achieved 
by the current method at the end of 6 000 sweeps. This 
shows that the current method is one to two orders of 
magnitude more efficient than the WL recursion in terms 
of simulation time. 

The efficiency of the method can be further improved 
when parallel computers are available. We briefly de- 
scribe a parallel extension here. In the parallel version, 
multiple copies of simulations run simultaneously using 
a same set of et^'s. All copies contribute to the trajec- 
tory averages, such as (/3 2 ) and {(3E^. The parameters 
dfc's shared by all copies are calculated from the averages 
from multiple trajectories and therefore are more accu- 
rate. In Fig. [2j we also show that the result from the 
parallel version (dashed line) using four copies. The error 
at the end of 10 5 sweeps is 0.0156, which is about half of 
the single copy version. According to the i -1 / 2 scaling 
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FIG. 2: The logarithmic error of the ratio of the partition 
function e(ln[^(Ai)/lnZ(A )]) versus the simulation time t. 
For the WL updating scheme, three thresholds of the his- 
togram flatness 20%, 50%, and 99% are used. The result 
from the In / = l/t n correction (where t n is the number of 
MC steps divided by the number of temperatures) is shown as 
the dotted line. For the current method (solid line), the error 
e scales with the simulation time t as e ~ t~ ' 52 . By using the 
parallel version (four copies, dashed line), the error is further 
reduced. All results are averaged over 1000 independent runs. 



found low energy states. 

Due to this reason, we use a more aggressive averaging 
scheme that favors recent statistics. In this scheme, we 
introduce a memory factor 7 < 1 to gradually shrink the 
weight of previous statistics. For example, the average 
energy (i?) is computed as £ /Af, where the total energy 
£ and the total weight Af (which are both accumulated 
from the beginning of simulation) are updated as £ — > 
j£ + E and TV — * 7A/ + 1, where 7 < 1. If 7 = 1, 
the averaging scheme is reduced to a regular average. 
The factor 7 < 1 is particularly useful in correcting low 
temperature statistic^Sggnd in enhancing the temperature- 
space random ^hk. 1 However, 7 should still be close to 1 
to maintain a good sampling accuracy. In this example, 
7 = 1.0 — 10~ 7 is used. 

In implementation, Brownian dynamics is used for con- 
stant temperature simulation. The equation of motion 
is dx/dt = F + if , where F is the force derived from a 
molecular potential; ff is a vector of Gaussian white noise 
specified by {fli{t)rjj(t')) — 2T6(t~t')Sij; T is the current 
temperature. The time step is 3 x 10 -3 . For polymer of 34 
and 55 residues, the temperature range is (0.1,0.7). The 
method can easily locate the known lowest energy config- 
urations with E = —98.3571 and —178.1339 respectively 
[l[. No other lower energy state is discovered. 



relation, the convergence rate is about four times as fast 
as the single copy one, as we expected. By contrast, the 
WL type updating does not have a convenient parallel 
counterpart. 



B. Atomistic model protein 

The method is also applied to locate low energy states 
of a AB model protein M , which has been extensively 
studied in literature 

Q, llO, HH- It has two types of 
residues, A: hydrophobic, B: hydrophilic. Particularly, 
we used the second set of molecular force fields @ , which 
produces more globular low energy structures. 

The major challenge of this system is that it contains 
many different low-energy wells separated by high bar- 
riers. Due to its rugged low-energy landscape, the AB 
model serves as a stringent testing case for the ability of 
configurational sampling of the algorithm. Although in 
principle thermal properties are determined by averages 
from all low-energy wells, only the one with the lowest 
energy has a dominant contribution at a low tempera- 
ture. For example, if two energy wells have a AE = 2.0 
difference in their energy (which is common for a system 
with 55 or 89 residues), the contribution from the higher 
energy well is only exp(— AE/T) «2x 10~ 9 times that 
of the lower energy well at temperature T = 0.1. Since 
lower energy states are gradually discovered as the simu- 
lation proceeds, average properties estimated in an earlier 
time must be promptly corrected according to the newly 




(a) (b) 

FIG. 3: Panel (a): the lowest-energy configuration of 89 
residue model protein, E = —311.6137; Panel (b): a differ- 
ent configuration with a similar local minimal energy E — 
-311.5391 (black, A; white, B). 

The polymer of 89 residues is more challenging and 
its lowest energy configuration has not been reported 
in literature to authors' best knowledge. In this case, 
the temperature range is T — (0.15,0.4) with five win- 
dows, separated at T = 0.20, 0.23, 0.26, and 0.3. Within 
each window, a second order expansion in term of T, 
E = ciq + a\T is used. This expansion is more suit- 
able than the expansion in terms of (3 at a low tempera- 
ture because ao is roughly the ground state energy while 
a\ is the average heat capacity. The weighting factor 
w(T) dT ~ 1/ {l + [(T — T c )/A] 2 } dT concentrated on 
T c = 0.25 with a width A = 0.1 is used to accelerate 
the temperature-space random walk. In addition, since 
the goal is to find the ground state instead of calculat- 
ing the free energy, we replaced the weighted averages by 
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regular averages in Eq. ([5]) and remove the constraint 
Eq. © to make the averaging and updating process 
more stable. The lowest energy state found in this study 
is E = —311.6134. The corresponding configuration is 
shown in Fig. [3]^a). Here, another state with a very close 
energy E = —311.5391 but with a very different config- 
uration is also shown in Fig. [3fb). The result indicates 
an extreme ruggedness of the low energy landscape. 

IV. CONCLUDING DISCUSSIONS 

In summary, we demonstrated an enhanced sampling 
method using a generalized ensemble. The method com- 
putes the partition function by minimizing the difference 
between the derivative of the estimated partition func- 
tion and that of the actual one. One advantage of the 
method is that it allows a large gap of the macroscopic 
variable A of the partition function. For example, when 
A = /?, it can afford a much larger temperature gap than 
other tempering methods [l|, 0, H[ that rely on overlap 
between distributions. This feature makes the method 
more suitable for handling larger systems with much nar- 
rower distributions of thermodynamic quantities. The 
method also delivers asymptotic convergence of the par- 
tition function, which makes it superior to other methods 
based on the WL recursive scheme [6J. The efficiency of 
the method is demonstrated on the two dimensional Ising 
model and the off-lattice protein models. 

One of the most important features of our method is 
its scalability to large systems. The method performs 
a random walk in the A space, e.g., the reciprocal tem- 
perature as in the examples, and the ratio of parti- 
tion functions between two end-points of a A window is 
calculated by minimizing the difference between the esti- 
mated and measured (from simulation trajectory) values 



of the thermal conjugates of A (the energy in the case 
of A = (3). By including parameters that correspond to 
higher-order derivatives of the partition function in the 
Langevin equation controlling the A-space random walk, 
the profile of the partition function within the A window 
is more accurately approximated and the Brownian mo- 
tion in the A space is augmented. Thus, as long as the 
thermal conjugates varies smoothly within a A window, 
the method can bring an efficient sampling of the entire 
A window. Such a feature makes the method much less 
sensitive to the size of system, as well as the size of A 
window. 

In this study, we demonstrated the efficiency of the 
method in the reciprocal temperature space (A = /?). 
However, A can be other variables. In our previous study 
[U, we used the volume. Besides, it can be other A- 
parameter commonly used in free energy simulation 0, 

Strictly speaking, the current method does not sat- 
isfy detailed balance due to the use of runtime averages, 
as in other algorithms before convergence P, [H, 
However, as simulation progresses the correction to the 
existing averages continuously decreases, and the devia- 
tion from detailed balance is negligible in the asymptotic 
limit. On the other hand, the runtime averaging process 
is essential to continuously improve the estimate of the 
partition function. 
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